Identifying key morphometrics to post-storm beach recovery through explainable AI

In the context of ongoing discussions about climate change, the focus on beach resilience has gained significant attention in contemporary studies. However, a comprehensive understanding of beach resilience, particularly in the short term, remains limited. This study utilizes a dataset of 104 storm events in Hasaki beach, located on the East coast of Japan, to investigate the 10-day beach recovery. The study considers four types of distinct beach profile patterns based on sandbar formations. Employing XGBoost and the SHAP explanation method, the influence of morphometric indicators on beach resilience were explored. Resilient beach profiles are anticipated to exhibit rapid recovery following erosional storm events. The results reveal that morphometrics play a crucial role in determining the short-term, 10-day, recovery of beaches, with specific morphometric features demonstrating pronounced effects based on profile patterns. The study contributes to the current knowledge of post-storm beach recovery and provides insights that could inform discussions on beach resilience.

storm-scale recovery of the fore-beach accretion; the initial stage of their post-storm recovery model.Despite the disruptive impact of stronger waves on nearshore morphology, the beach's recovery from such disturbances is facilitated by milder short-period waves in the post-storm period.Surf zone alterations during high wave conditions are primarily attributed to disturbances to the seabed.Instances of high turbulences in the surf zone may induce a seaward net sediment flow, often resulting in the displacement of foreshore sediment further seaward.When examining the beach recovery, a semi-empirical criterion proposed by Sunamura and Horikawa 17 serves to quantify beach response in this context: where H 0 and L 0 are deep-water wave height and length, respectively, tan β is bottom slope, d 50 is grain size, and C s is a dimensionless constant.C s is classified as follows.
Positive values resulting from this calculation indicate erosion.While negative values show beach accretionary conditions.
A dimensionless parameter, K * , proves useful in explaining the stage movement of cross-sectional beach profiles by incorporating wave data and grain sizes 18,19 : where H B is daily average breaking water height, d is the sediment grain size, and T is the wave period.Based on the K * value, profile characteristics range from erosional extremes to accretionary extremes.However, these expressions have been derived empirically, and their validity is limited to the conditioned used to derive them.Therefore, there remains a need for sound theoretical equations to address knowledge gaps and provide a clearer understanding of beach recovery mechanisms.While numerous studies have concentrated on hydrodynamic conditions, only a limited number have considered morphological variables related to sandbar characteristics when formulating mathematical equations to comprehend beach behavior in the post-storm period 12,20 .
Existing post-storm recovery quantifications heavily rely on numerical simulations, which often impose a significant computational burden.However, recent advancements in Machine Learning (ML) algorithms offer a promising alternative, showcasing their ability to grasp intricate natural processes with considerably lower computational costs.While empirical equations, predominantly derived from laboratory experiments, aim to understand inter-correlations between major morphometrics and hydrodynamic conditions, improved ML models with extensive hyperparameters have showcased potential in capturing these intricate natural behaviors.The integration of ML in coastal and ocean engineering is currently flourishing, yielding results suggesting higher accuracies in capturing complex relations [21][22][23][24][25][26][27] .At the same time, recent developments in data observation techniques, such as remote sensing, have generated large-scale datasets across various scientific disciplines.These datasets are instrumental in training ML or Deep Learning (DL) models to serve as local and global interpretations, explaining dependent variables in respective studies.While DL algorithms trained on neural networks excel in capturing natural behaviors, their performance with numerical tabular data is less compelling 28,29 .In this context, XGBoost 30 , a boosting decision tree algorithm, has proven its excellence in discerning complex relations within homogeneous tabular data.A widely discussed drawback of using machine learning tools is their low explainability; however, recent developments in feature importance tools, such as SHapley Additive exPlanations 31 (SHAP), have shown effectiveness in interpreting model results 25,32,33 .
Given the limited discussions on the influence of morphometric variables in beach recovery, the present study aims to fill this gap by setting quantitative objectives and employing innovative ML-driven tools for measurement of the role of morphometrics on recovery.Additionally, while the presence of sandbars significantly shapes beach profiles during recovery, relevant studies are notably scarce.To address this, we examine the resilient behavior of four types of distinct profile patterns based on sandbar formations, a consideration often overlooked in prior studies, whether they be erosional or accretionary.By incorporating sandbar formations into our beach resilient discussions, we expect to strengthen the present understanding of the role of beach morphology in its recovery.

Study site
The Hasaki Oceanographic Research Station (HORS) is situated in the middle of an approximately 16 km long straight sandy beach strip, bounded by Kashima port to the North and Choshi port to the South, in Japan (Fig. 1a).HORS features a 427.0 m long pier extending perpendicularly to the Hasaki coastline.This pier has played a significant role in amassing a valuable dataset documenting morphological and hydrodynamic changes in the Hasaki coast since 1987 34 .All survey and research activities are carried out by the Port and Airport Research Institute (PARI).With a tidal range of 1.45 m and an orientation of 59°counterclockwise from the North, this dataset has been instrumental in numerous studies aiming to comprehend the intricate behavior of nearshore morphodynamics 35 .In this study, we utilized a comprehensive 24-year dataset (from 1987 to 2010) of crossshore beach profile measurements and concurrent offshore wave measurements to identify storm events and subsequently conducted statistical analyses to quantify the influence of key morphometrics on beach recovery. (1)

Wave data and extraction of storms
The mean significant wave height and period over the 24-year period were 1.20 m and 7.87 s, with standard deviations of 0.61 m and 1.91 s, respectively.A graphical illustration showing the variation of annual wave conditions from 1987 to 2010 can be found as Supplementary Fig. S1 online.A site-specific, significant wave height, H S , -based storm identification approach was employed as proposed by Thilakarathne et al. 33 .The criteria involved a minimum H S value of 2.5 m and a minimum duration of 6 h, with H S exceeding 1.5 m being considered as the storm duration 13,39 (Fig. 2).Utilizing the hourly H S data from HORS, a Python module was developed to identify storm events, resulting in 347 storm events.Following the initial identification, the selected storm events underwent additional refinement to exclude consecutive storm occurrence within 10 days.This step was crucial to avoid potential disturbances to the beach recovery process.Additionally, negative erosional storm cases  Storm identification based on the two threshold approach.The threshold of 2.5 m was employed to identify storm events and 1.5 m was employed to calculated the storm duration.Shoreline variation with the presence of storms and recovery in post-storm period are also shown here.Consecutive storm occurrence damaging the shoreline recovery is visible in September, 2007.

Beach resilience number
The beach resilience number (BRN) was introduced as a metric to quantify the recovery potential of diverse beach profile patterns.Eichentopf et al. 41 employed an 18-day post-storm period to detect consecutive storm occurrences at Hasaki.We initially considered 7-day, 10-day, 14-day, and 21-day intervals and compared the occurrence of sequential storms and shoreline recovery measurements for a better statistical approach.We then chose a 10-day threshold, considering the observed recovery patterns at Hasaki.This 10-day selection provided an undisturbed period from consecutive storm occurrences and a significant dataset for our statistical analysis of beach resilience.BRN is defined as the ratio between the shoreline recovery after 10 days and the shoreline erosion during the storm event: Here, BR represents the beach recovery after 10 days (measured from the post-storm shoreline position), and dSL signifies shoreline erosion after the corresponding storm event (measured from the pre-storm shoreline position).For example a BRN value of 1 indicates full recovery of the shoreline to initial state.Additionally, a negative BRN signifies negative recovery, indicating further instances of erosion during the post-storm period.
Since the BRN is a measure of beach recovery relative to shoreline erosion, excessively large BRN values might not provide meaningful insights.Consequently, outliers were identified and removed using Interquartile Range Technique 42 from the BRN dataset as shown in Fig. 3, resulting in a refined dataset of 104 storm cases for the beach resilience analysis.

Summary of morphometric and hydrodynamic indicators
Table 1 provides a comprehensive summary of all the morphometric and hydrodynamic indicators examined in the feature importance analysis.The careful selection of these indicators aimed to represent profile characteristics spanning from the backshore area to offshore sandbar formations.In addition, due consideration was given to past research on crucial morphometrics.While the sediment size, d 50 , was omitted, the foreshore slope is expected to serve as a representative indicator of sedimentologic characteristics.Fig. 1b shows inner and outer zone definitions based on the foreshore boundary, average breaking point, and closure depth at the study site.www.nature.com/scientificreports/Extracted from daily profile data, selected sandbar characteristics are illustrated in Fig. 1c, providing a clear outline of the definitions employed in this study.

Feature importance analysis
In the present study, we conducted a feature importance analysis using a combination of XGBoost 30 models and the SHapley Additive exPlanations (SHAP) algorithm 31 .XGBoost 30 is a widely used boosting tree-based machine-learning algorithm.SHAP 31 is a game theory-based feature explanation method, represented by Eq. ( 4): where, i denotes the Shapley value assigned to feature i.The term E[f (x)|x S R i ∪i ] signifies the expected model output for the R case when feature i is included in the model prediction, while ] signifies the expected model output for the R case when feature i is excluded from the model prediction.This process of inclusion and exclusion for each indicator forms the foundation of SHAP for quantifying feature importance.Moreover, M corresponds to the total count of potential feature combinations, coalitions.This methodology aligns with the approach employed by Thilakarathne et al. 33 , which focused on quantifying feature importance in beach susceptibility.
Initially, we trained four separate XGBoost models using 18 indicators (Table 1) as inputs and the Beach Resilience Number (BRN) as the output.These trained models formed the basis for explaining the influence of each indicator, encompassing 14 morphometric and 4 hydrodynamic variables, on BRN, and consequently, on beach recovery.With the trained models in hand, we employed the SHAP algorithm to explain both local and global interpretations of feature importance.Local interpretations provide insights into how each input variable influences individual predictions, while global interpretations provide a broader perspective on the overall impact of an indicator by averaging the absolute SHAP values across a particular dataset.In the present study, we have four datasets based on the profile pattern.Thus, global interpretation values are calculated as the average of the absolute local interpretations across the profiles within each of the four profile patterns.
As the first phase of feature importance analysis, based on the global interpretation values for each indicator, we selected key indicators with a significant influence on beach recovery, quantified as BRN in this study.This initial selection phase aimed to identify the most impactful indicators, guiding our subsequent analyses.Subsequently, we refined our analysis by training new four XGBoost models for each of the profile patterns using only the selected significant indicators as inputs.These refined models enabled us to investigate deeper into the specific contributions of each selected indicator to the output variable, BRN.By applying the SHAP algorithm once again to these models, we gained further insights into the relative importance of each indicator in predicting the output variable, both locally and globally.This employed feature importance analysis provided a comprehensive understanding of the role played by different indicators in deciding the beach recovery in the 10-day post-storm period. (4) Table 1.Summary of the morphometric and post-storm 10-day hydrodynamic indicators employed in the present beach study.Clear definitions of the morphometric indicators are graphically illustrated in Fig. 1b, c.

Feature importance analysis
Normalized SHAP global interpretation values for each indicator across the four profile patterns are shown in Fig. 4.These values were normalized by dividing the sum of their global interpretation, allowing for comparisons within a profile pattern but not across different patterns.The results reveal that key indicators vary depending on the profile pattern, underscoring the influence of the beach profile transformation stage on recovery.Profile patterns signify distinct stages of beach transformation, aligning with the discussions by Sunamura et al. 17 .The inclusion of the last four indicators in our field data captures hydrodynamic conditions in the 10-day post-storm period.In contrast, Phillips et al. 43 used correlation coefficients to assess the impacts of sandbar location and the rate of cross-shore sandbar migration but considered longer recovery periods of 7, 14, 30, and 60 days, including the presence of consecutive storm occurrences.However, our study, focusing on specific morphometric indicators, was constrained by the presence of consecutive storm occurrences, limiting the post-storm period duration to 10-days.Also, in the initial stage of our study, we tried to use correlation coefficients to explain complex nearshore processes and they were not statistically and theoritically satisfying.The first phase of the feature importance analysis aimed to identify critical indicators in each profile pattern for use in the subsequent local interpretation phase.Utilizing a 0.1 threshold for each profile pattern yielded distinct indicators for different profiles.Out of the 18 indicators, eight were selected based on their global interpretations concerning BRN.Specifically, for unbarred profiles, the chosen indicators included shoreline position, outer zone sediment volume, mean and standard deviation of wave height in the post-storm period.Inner sandbar profiles featured selected indicators of outer zone sediment volume and standard deviation of wave height in the post-storm period.Outer sandbar profiles were characterized by one morphometric indicator, the seaward slope of the outer sandbar, and the mean value of wave period in the post-storm period.Finally, double sandbar profiles only exhibited one indicator significantly related to beach resilience, which was the outer sandbar depth.
In the second phase of the feature importance analysis, using the selected 8 indicators as inputs, we focused on local interpretations to showcase the variation of importance of each selected indicator in each post-storm recovery case, which was decided based on the post-storm profile pattern.Local interpretations explain the model's predictions for individual inputs, hence the variation of indicator importance with indicator value can be quantified.
Figure 5 33 .blue contribute positively to beach recovery due to positive SHAP values.In unbarred and inner sandbar profiles, wave conditions had a significant influence on recovery, with severe wave conditions characterized by higher wave heights and periods in the 10 days resulting in negative SHAP values, signifying reduced BRN values and lower resiliency in beach profiles.

Hydrodynamic condition on beach recovery
Recognizing the significant influence of hydrodynamic variables on nearshore morphology changes, we assessed the impact of wave action during the 10-day post-storm period on beach recovery, as quantified by BRN.The last four indicators among the 18 considered represented the wave condition for each profile pattern.The mean significant wave height ( H S ) values for unbarred, inner sandbar, outer sandbar, and double sandbar profiles were 1.16 m, 1.12 m, 1.09 m, and 1.00 m, respectively.Similarly, mean wave period ( T S ) values were 7.90 s, 7.85 s, 7.98 s, and 7.43 s across these four profiles.Among the 104 post-storm cases, 32 exhibited mean H S values exceeding the long-term average, while 25 displayed negative BRN values, indicating further erosional cases.Interestingly, only 12 of these 25 cases had higher H S values than the long-term average, suggesting that wave conditions alone do not determine post-storm changes.However, when the mean H S during the post-storm period exceeded 1.60 m, negative BRNs were more prevalent.Furthermore, it was observed that the majority of positive BRN cases, 78.48 %, occurred under milder wave conditions, reinforcing our approach to exclude consecutive storm occurrences.
A graphical illustration showing the distribution of wave height ( H S ) and wave period ( T S ) for the complete dataset, storm conditions, and post-storm periods for each of the profile patterns can be found as Supplementary Fig. S2 online.

Profile patterns on beach recovery
In Fig. 5a, the local interpretations of each selected indicator for the beach recovery (BRN) of unbarred profiles after 16 storm cases are presented, focusing on how individual post-storm beach morphometrics contribute to beach recovery, either positively or negatively.In the absence of a sandbar, the probability of wave breaking significantly diminishes, allowing larger waves to directly impact the foreshore.This, in turn, hinders post-storm recovery, resulting in less resilient beach profiles.Key morphometrics controlling recovery include shoreline and outer zone sediment volume, where higher values exhibit negative SHAP values, indicating obstructing the recovery.Specifically, the standard deviation of H S , h d , emerges as the most important indicator, shadowing the impact of morphometrics in this context.The accumulation of sediment in the outer zone contributes to recovery during milder post-storm periods.The variation of SHAP values for different shoreline positions (sl) in Fig. a shows that larger shoreline values (seaward shoreline positions) lead to relatively less resilient profiles.Although the magnitudes of SHAP values are not cogent, this suggests that largely accreted profiles are more prone to erosion even under average wave conditions, indicating lower resilience.Unbarred profiles often represent the endpoint of long-term sandbar migration, where most of the sediment is located on the beach, making the beach zone sediment-rich but more susceptible to erosion.
It is needed to highlight that the landward migration process of the sandbar during the spring plays a critical role in ensuring a complete subaerial beach recovery over the mild wave period in the summer, particularly in the context of summer profiles with larger sediment accretions, mostly resulting in unbarred profiles.As highlighted by Ruiz de Alegría-Arzaburu and Vidal-Ruiz 44 , the recovery period is significantly influenced by seasonal wave climates, a factor not directly addressed in this study.
In Fig. 5b, the local interpretations of each selected indicator for the beach recovery (BRN) of inner sandbar profiles after 27 storm cases are presented.These local interpretations signify the contribution of each selected indicator to BRN, indicating that, similar to the unbarred profiles, wave conditions also exert significant control over recovery for inner sandbar profile patterns.Thus, for this dataset, beach recovery appears to be influenced by the wave condition during the post-storm period.The outer zone sediment volume, o v , emerges as the only morphometric indicating its importance on short-term beach recovery.Although a higher sediment budget in the outer sandbar, o v , appears to impede recovery during the post-storm period, unclear distributions of SHAP values hinder drawing definitive conclusions from this evidence.However, the impact of wave conditions is evident, as higher wave height standard deviations during the post-storm period prevent recovery.
Figure 5c shows the local interpretations of each selected indicator for the beach recovery (BRN) of outer sandbar profiles after 40 storm cases.Outer sandbar profiles were the predominant category among the 104 datasets for post-storm recovery indicators, offering substantial and satisfactory data for conducting the feature importance analysis.Unlike the other three patterns, a clear impact of significant wave height, H s , on the BRN was absent.The variation of the only hydrodynamic indicator, T S , with the SHAP values did not show a strong correlation, and its mean during the post-storm period of outer sandbar profile patterns (7.98 s) was not significantly different from the long-term mean (7.87 s).
When considering morphometric indicators, the seaward slope, o ss , and foreshore slope, sp, appeared to exert a significant impact on beach recovery, hence, the predominant characteristics of beach resilience.A clear impact was observed between higher o ss and beach recovery, making it the most important morphometric indicator for outer sandbar profiles.This observation is evident in the first row of Fig. 5c, which shows positive SHAP values for higher seaward slopes of outer sandbars.The impact of sandbar slopes on recovery, and beach morphodynamics in general, is an area that has not received significant attention.However, our study considered this aspect and found that the seaward slope of sandbars plays an important role in recovery for outer zone sandbar profile patterns.The local interpretations were clear: steeper sandbars have a positive impact on recovery, while milder slopes have the opposite effect.This means that outer sandbars with steeper slopes are less stable, more prone to erosion, and provide a higher sediment supply, facilitating quicker recovery.The impact of the other chosen morphometric indicator, sp, identified in the initial phase of the feature importance analysis, appeared to have a lesser effect on BRN, with SHAP values ranging between −0.5 to 0.5.We included it as an indicator, especially as it represents the sedimentological characteristics of the beach and surf zone.Various discussions on the relationship between foreshore slope and sediment size are available 34,37,45,46 .Reis and Gama 47 have presented a correlation between these two factors based on the constructal law: where a is a wave parameter, B is related to the water height, and d is the sediment diameter.This relationship indicates that higher slopes represent steeper beaches, a claim also made by Wright et al. 1 .
Figure 5d shows the local interpretations of the only indicator selected for beach recovery, BRN, of double sandbar profiles after 21 storm cases.In the global interpretation shown in Fig. 4d, the water depth at the outer sandbar, o d , emerged as significantly important in controlling beach recovery.Shallower water depths at the outer sandbar suggest that waves break even for lower wave heights during the recovery period, disturbing the outer zone sandbar and bringing sediment to the beach under milder wave conditions, facilitating easier recovery.With normalized importance exceeding 0.5, o d was identified as the sole indicator for double sandbar formations, given that these profiles encompass all 18 indicators.Such a high global interpretation importance, surpassing 50%, provides statistically robust evidence.
Upon closer examination of the local interpretations of o d for the recovery mechanism, it became apparent that lower water depths were associated with quicker recoveries in the beach system, indicative of more resilient beach profiles.However, the impact of larger water depths on recovery was less clear, with higher values showing negative SHAP values (Fig. 5d).Lower water depths at the outer sandbar profiles facilitate sediment transport,

Conclusion and future work
In the present study, we conducted a detailed statistical analysis of the short-term beach recovery process using a dataset of 104 storm events, with a focus on understanding how recovery varies based on different beach profile patterns.Building upon our previous work 33 , which concentrated on beach susceptibility, we employed several methodologies established in that study to identify storms, profile patterns, and cross-shore zones.Investigating 14 morphometric parameters within four distinct profile patterns (unbar, inner, outer, and double sandbar profiles), in the present study, we aimed to explain and interpret their impact on short-term recovery dynamics.Key influential morphometric features for each of the four profile patterns were identified as shoreline position and outer zone sediment volume significantly for unbarred profiles, outer zone sediment volume for inner sandbar profiles, seaward sandbar slope and beach slope for outer sandbar profiles, and water depth at the outer sandbar for double sandbar profiles.Most importantly, our findings underscored the significant contribution of beach profile patterns to the recovery process, as these patterns represent different stages in the long-term transformation of beach profiles.This study addresses a significant gap in the literature by providing a comprehensive analysis of short-term beach recovery.
Results further highlight the role of wave conditions in controlling beach recovery following extreme storm events.We hypothesize that the insights gained from this study can serve as a foundation for future research endeavors, providing a basis for understanding and analyzing beach resilience variations through the proposed Beach Resilient Number (BRN).Moreover, our observations indicate that water depth at the outer sandbar crest plays a crucial role in governing the recovery process.This aligns with the established understanding that the outer sandbar acts as a sediment source, expediting beach recovery.Additionally, the seaward slope of the outer sandbar emerges as a key factor influencing recovery; steeper slopes appear to yield the beach more fragile, yet contribute to sediment availability for recovery.Therefore, the combination of low water depths and higher seaward slopes contributes to overall beach profile resilience.
While the present study examines the importance of considering beach profile patterns in recovery processes, certain limitations provide insights for future research recommendations.The analysis employed a relatively small dataset size due to the storm data filtering conditions, including requirements for the absence of consecutive storm occurrences and significant erosions during the storms.To enhance the robustness of the proposed BRN and methodology, we recommend the use of larger datasets for validation in future work.Furthermore, we believe that the methods developed in this study are transferrable to various sandy beaches, as evidenced by our satisfactory apprehension of beach recovery dynamics even with a relatively modest dataset.Future research can build upon these insights to advance our understanding of beach recovery and enhance the applicability of the proposed methodologies across diverse coastal environments.

Figure 1 .
Figure 1.(a) Location of the Hasaki Oceanographic Research Station (HORS) in the Ibaraki prefecture, Japan.(b) Beach profile immediately after a storm event and after a 10-day period, where shoreline changes based on the high water level (HWL) are considered as the beach recovery.Inner and outer zone definitions are based on the foreshore boundary and average wave breaking location.(c) Sandbar identification based on trough and bar formation, with sandbar indicators defined based on this figure.

Figure 2 .
Figure 2.Storm identification based on the two threshold approach.The threshold of 2.5 m was employed to identify storm events and 1.5 m was employed to calculated the storm duration.Shoreline variation with the presence of storms and recovery in post-storm period are also shown here.Consecutive storm occurrence damaging the shoreline recovery is visible in September, 2007.

Figure 3 .
Figure 3. Distribution of BRN values for the 125 storm dataset selected after filtering out consecutive storm occurrences and accretionary cases.The Interquartile Range Technique was employed to detect outliers in the Beach Recovery Number (BRN) distribution for the final storm data filtering, resulting in the selection of the final 104 storm dataset for analysis.

Figure 4 .
Figure 4. First phase of feature importance analysis using global interpretations of the indicators for the Beach Recovery Number values.Sub figures denote: (a) unbarred profiles, (b) inner sandbar profiles, (c) outer sandbar profiles, and (d) double sandbar profiles.A threshold of 0.1 was used to identify the significant indicators as proposed by Thilakarathne et al. 33 .

Figure 5 .
Figure 5. Second phase of feature importance analysis using local interpretations of the indicators for the beach recovery number values.Sub figures denote: (a) unbarred profiles, (b) inner sandbar profiles, (c) outer sandbar profiles, and (d) double sandbar profiles.Each color bar denotes the range of indicator values, with blue hues representing smaller values and red hues indicating larger values.